# Wav2Vec2 Fine-tuning
Finvoc2vec
A voice tone classifier designed for corporate disclosure scenarios, built on the Wav2Vec2 architecture and trained in two phases.
Audio Classification
Transformers English

waiv
17
1
Music Classifier
Audio classification model based on Wav2Vec2 for music genre recognition
Audio Classification
Safetensors
gastonduault
478
2
Speech Emotion Recognition With Facebook Wav2vec2 Large Xlsr 53
Apache-2.0
A speech emotion recognition system fine-tuned from the Wav2Vec2 Large XLSR-53 model, capable of identifying seven common emotions.
Audio Classification
Transformers

firdhokk
66
0
Wav2vec2 Xlsr English Speech Emotion Recognition
This model is used to recognize six basic emotions from English audio: anger, disgust, fear, happiness, sadness, and surprise, trained on the RAVDESS dataset.
Audio Classification
Transformers English

AreejB
82
0
Wav2vec2 Large Robust 6 Ft Age Gender
This model, fine-tuned from Wav2Vec2-Large-Robust, can predict the speaker's age and gender from raw audio.
Audio Classification
Transformers

audeering
19.29k
2
Englishmodel
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m, primarily used for English speech-to-text tasks.
Speech Recognition
Transformers

Foxasdf
24
1
Wav2vec2 Xls R 300m En Atc Uwb Atcc And Atcosim
Apache-2.0
An English air traffic control speech recognition model fine-tuned from wav2vec2-xls-r-300m.
Speech Recognition
Transformers English

Jzuluaga
37
7
Malaya Speech Fine Tune Realcase 22 Jun
A speech recognition model fine-tuned from wav2vec2-xls-r-300m-mixed on the Singapore English (uob_singlish) dataset.
Speech Recognition
Transformers

RuiqianLi
20
0
Project NLP
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3355 on the evaluation set.
Speech Recognition
Transformers

zakria
22
0
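
Several entries in this list report a word error rate (WER), mostly as a fraction (for example 0.3355) and once as a percentage (10.08%). As a rough illustration of what the metric measures, the sketch below computes WER as the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and the listed model cards do not state how text was normalized before scoring, so this is not necessarily how any of the reported figures were produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (not from any listed model card). Text is split on
    whitespace only; no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 / 6 words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER as a fraction: {wer:.4f}")
    print(f"WER as a percentage: {wer * 100:.2f}%")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. A fraction such as 0.3355 and a percentage such as 33.55% describe the same error rate.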
Wav2vec2 Xls R 300m Timit Phoneme
Apache-2.0
An automatic phoneme recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the TIMIT dataset, primarily used for phoneme-level recognition of English speech.
Speech Recognition
Transformers English

vitouphy
8,457
29
Wav2vec2 Base Timit Demo Colab11
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate of 0.4348 on the TIMIT dataset.
Speech Recognition
Transformers

sameearif88
18
0
SSL Harveen Chadda Fine Tuning
MIT
A Hindi speech recognition model fine-tuned from Harveenchadha/vakyansh-wav2vec2-hindi-him-4200 on an unspecified dataset, achieving a word error rate of 10.08% on the evaluation set.
Speech Recognition
Transformers

rajat99
30
0
Wav2vec2 Large Robust 12 Ft Emotion Msp Dim
This model is fine-tuned from Wav2Vec2-Large-Robust for speech emotion recognition, predicting values in three dimensions: arousal, dominance, and valence.
Audio Classification
Transformers English

audeering
394.51k
109
Finetune Indian Asr
An Indian English speech recognition model fine-tuned from Harveenchadha/vakyansh-wav2vec2-indian-english-enm-700.
Speech Recognition
Transformers

Simply-divine
20
1
Daniel Asr
Apache-2.0
An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-base, achieving a word error rate of 0.3423 on the evaluation set
Speech Recognition
Transformers

danielbubiola
25
0
Wav2vec2 Large Xlsr 53 Turkish
Apache-2.0
An automatic speech recognition (ASR) model fine-tuned from Facebook's wav2vec2-large-xlsr-53 model on the Turkish Common Voice dataset.
Speech Recognition Other
ceyda
54
1
English ASR
Apache-2.0
An English automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-base, achieving a word error rate of 0.3397 on the evaluation set.
Speech Recognition
Transformers

maher13
13
0
Wav2vec2 Base Timit Asr
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-base on the timit_asr dataset, supporting audio input sampled at 16 kHz.
Speech Recognition
Transformers English

elgeish
174
0
Wav2vec Test
Apache-2.0
A fine-tuned Egyptian Arabic automatic speech recognition model based on facebook/wav2vec2-large-xlsr-53, trained using the arabicspeech.org MGB-3 dataset.
Speech Recognition
Transformers Arabic

othrif
27
0
Bp500 Base10k Voxpopuli
Apache-2.0
This is a Wav2vec 2.0 speech recognition model optimized for Brazilian Portuguese, fine-tuned on multiple Brazilian Portuguese datasets
Speech Recognition
Transformers Other

lgris
23
0
Wav2vec2 Large Xls R 300m Ha Cv8
Apache-2.0
A Hausa speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset.
Speech Recognition
Transformers Other

anuragshas
17
1
Bp Commonvoice10 Xlsr
Apache-2.0
A Wav2vec 2.0 model fine-tuned for Brazilian Portuguese speech recognition on the Common Voice 7.0 dataset.
Speech Recognition
Transformers Other

lgris
25
0
Bp Cetuc100 Xlsr
Apache-2.0
A Wav2vec2 model fine-tuned for Brazilian Portuguese on the CETUC dataset, which provides approximately 145 hours of Brazilian Portuguese speech data.
Speech Recognition
Transformers Other

lgris
22
0
HIYACCENT Wav2Vec2
HIYACCENT is a speech recognition system optimized for Nigerian English accents, built upon an enhanced Wav2Vec2 architecture with over 20% performance improvement.
Speech Recognition
Transformers

codeceejay
27
1
Bp Sid10 Xlsr
Apache-2.0
This is a Wav2vec 2.0 model fine-tuned for Brazilian Portuguese, trained using the Sidney dataset, suitable for automatic speech recognition tasks in Brazilian Portuguese.
Speech Recognition
Transformers Other

lgris
21
0
Bp Commonvoice100 Xlsr
Apache-2.0
This is a Wav2vec 2.0 model fine-tuned for Brazilian Portuguese, trained on the Common Voice 7.0 dataset, supporting Portuguese speech recognition tasks.
Speech Recognition
Transformers Other

lgris
21
0
German Trained
Apache-2.0
A German speech recognition model fine-tuned from flozi00/wav2vec-xlsr-german, primarily used for German speech-to-text tasks.
Speech Recognition
Transformers

chaitanya97
24
0
Wav2vec2 Xls R 300m Arabic
Apache-2.0
An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on the Arabic Common Voice 7 dataset.
Speech Recognition
Transformers Arabic

AndrewMcDowell
148
0